CLAIMS

We claim: 

1. A method for identifying similar bugs, comprising:
generating a database that contains database tokens that relate to identified bugs;
generating input tokens associated with a bug in question;
scanning the database for occurrences of the input tokens; and
determining an overall probability as to whether the identified bugs are the same as the bug in question.

2. The method of claim 1, wherein generating a database comprises generating a derivative database from a bug database that contains failing results files.

3. The method of claim 2, wherein generating a derivative database comprises generating database tokens from character strings of the failing results files.

4. The method of claim 3, wherein generating database tokens comprises generating tokens for character strings that are proximate to the term "error" in the failing results files.

5. The method of claim 3, wherein generating database tokens comprises generating tokens for character strings that comprise at least one of letters, numbers, and underscores.

6. The method of claim 3, wherein generating database tokens further comprises noting the number of times each token occurs relative to each bug of the database.
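
By way of illustration only, the tokenizing and counting recited in claims 3, 5, and 6 might be sketched in Python as follows. The names `tokenize` and `build_token_counts`, and the in-memory dictionary standing in for the derivative database, are assumptions made for this sketch and do not appear in the claims. The proximity-to-"error" limitation of claim 4 is not modeled here.

```python
import re
from collections import Counter

# Assumed token pattern: runs of letters, digits, and underscores (claim 5).
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+")


def tokenize(text):
    """Split a failing results file's text into tokens."""
    return TOKEN_RE.findall(text)


def build_token_counts(failing_results):
    """Map each bug id to a Counter of its token occurrences (claim 6).

    `failing_results` is a hypothetical dict of bug id -> file text.
    """
    return {
        bug_id: Counter(tokenize(text))
        for bug_id, text in failing_results.items()
    }
```

For example, `build_token_counts({"BUG-1": "timeout timeout error"})` would record two occurrences of `timeout` and one of `error` against the hypothetical bug id `BUG-1`.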



7. The method of claim 1, wherein generating input tokens comprises generating tokens from character strings of an input failing results file of the bug in question.

8. The method of claim 1, wherein scanning the database comprises scanning the tokens of the database to identify matches for the input tokens.

9. The method of claim 1, wherein scanning the database further comprises identifying the number of occurrences of each input token in the database relative to each bug of the database.



10. The method of claim 1, wherein determining the overall probability comprises summing the total number of occurrences of each input token in the database and normalizing the total number of occurrences of each input token as to each bug of the database.



11. The method of claim 10, wherein determining the overall probability further comprises scaling normalized values that result from the normalizing to obtain scaled probabilities as to each input token relative to each bug of the database.
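
By way of illustration only, the summing, normalizing, and scaling recited in claims 10 and 11 might look like the Python sketch below. The claims do not specify how the scaling is performed; the linear mapping into the range 0.01 to 0.99 used here is purely an assumption, chosen so that no per-token probability is exactly 0 or 1. The function names and the dictionary layout are likewise hypothetical.

```python
def normalized_counts(token_counts, input_tokens):
    """For each input token found in the database, divide its per-bug count
    by its total count over all bugs (claim 10).

    `token_counts` is a hypothetical dict: bug id -> {token: count}.
    """
    result = {}
    for token in set(input_tokens):
        total = sum(counts.get(token, 0) for counts in token_counts.values())
        if total == 0:
            continue  # token never occurs in the database; nothing to normalize
        result[token] = {
            bug_id: counts.get(token, 0) / total
            for bug_id, counts in token_counts.items()
        }
    return result


def scale(value, lo=0.01, hi=0.99):
    """Assumed scaling step (claim 11): map [0, 1] linearly onto [lo, hi]."""
    return lo + (hi - lo) * value
```

With two hypothetical bugs `A` and `B` holding 3 and 1 occurrences of `timeout`, the normalized values are 0.75 and 0.25 before scaling.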

12. The method of claim 11, wherein determining the overall probability further comprises determining the standard deviance for each scaled probability and removing bug tokens from consideration that are associated with an input token having a deviance below a predetermined minimum deviance.

13. The method of claim 12, wherein determining the overall probability further comprises determining the overall probability as to all bugs using the scaled probabilities associated with those bugs.

14. The method of claim 13, wherein determining the overall probability as to all bugs comprises applying Bayes' Theorem to the scaled probabilities to calculate the overall probability for each bug as being the same bug as the bug in question.
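
By way of illustration only, one common way to apply Bayes' Theorem to a set of per-token probabilities is the combination sketched below in Python. It assumes that the tokens are independent and that "same bug" and "different bug" are equally likely a priori; neither assumption, nor the function name `combine`, comes from the claims, which do not state the exact formula used.

```python
def combine(probabilities):
    """Combine per-token probabilities for one bug into an overall probability.

    Assumed formula (independent tokens, equal priors):
        P = prod(p_i) / (prod(p_i) + prod(1 - p_i))
    """
    p_same = 1.0
    p_diff = 1.0
    for p in probabilities:
        p_same *= p
        p_diff *= 1.0 - p
    denominator = p_same + p_diff
    # If both products underflow to zero, fall back to "no information".
    return p_same / denominator if denominator > 0 else 0.5
```

Under this assumed formula, tokens whose probabilities sit near 0.5 have little effect on the result, while several tokens that each point strongly to the same bug push the overall probability toward 1.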

15. A system for identifying similar bugs, comprising:
means for generating input tokens associated with a bug in question;
means for scanning a database that associates bugs with database tokens pertaining to bugs for occurrences of the input tokens; and
means for determining an overall probability for each bug of the database of being the same bug as the bug in question.

16. The system of claim 15, wherein the means for generating input tokens comprise means for generating tokens from character strings of an input failing results file for the bug in question.

17. The system of claim 15, wherein the means for scanning a database comprise means for scanning the database tokens to identify matches for the input tokens and means for identifying the number of occurrences of the input tokens in the database relative to each potential bug.






HP Docket No.: 200208663-1 



18. The system of claim 15, wherein the means for determining the overall probability comprise means for determining a probability that a bug is the same relative to each database token associated with the bug.

19. The system of claim 18, wherein the means for determining the overall probability further comprise means for applying Bayes' Theorem to those probabilities to calculate the overall probability for each bug as being the bug in question.

20. The system of claim 15, further comprising means for generating the database from failing results files contained in a bug database.

21. A system stored on a computer-readable medium, the system comprising:
logic configured to generate a database that associates bugs with tokens derived from failing results files of the bugs;
logic configured to generate input tokens from an input that describes a bug in question;
logic configured to identify the number of occurrences of each of the input tokens in the database as per each potential bug; and
logic configured to determine an overall probability of each bug being the same as the bug in question relative to the number of occurrences.

22. The system of claim 21, wherein the logic configured to generate input tokens is configured to generate tokens from character strings of an input failing results file.

23. The system of claim 21, wherein the logic configured to determine the overall probability is configured to determine probabilities as to each bug relative to database tokens associated with those bugs.

24. The system of claim 23, wherein the logic configured to determine the overall probability is further configured to apply Bayes' Theorem to the determined probabilities to calculate the overall probability for each bug of being the bug in question.

25. A bug similarity system stored on a computer-readable medium, the system comprising:
a derivative database generator that is configured to generate a derivative database that contains a plurality of database tokens that are associated with identified bugs; and
a similarity calculator that is configured to:
    generate input tokens from an input that describes a bug in question,
    determine the number of occurrences of the input tokens in the derivative database relative to each bug,
    determine the probability of each bug being the same bug as the bug in question relative to each input token, and
    calculate an overall probability of each bug being the same bug as the bug in question using the determined probabilities.

26. The system of claim 25, wherein the derivative database generator is configured to generate the database tokens from character strings contained in failing results files of a bug database.

27. The system of claim 25, wherein the similarity calculator is configured to calculate the overall probability by applying Bayes' Theorem to the determined probabilities.

28. A computer system, comprising:
a processing device; and
a memory that comprises a bug similarity system, the bug similarity system being configured to generate a first set of tokens for each of several bugs, generate input tokens from an input that describes a bug in question, determine the number of occurrences of the input tokens in the first sets of tokens, determine the probability as to each of the bugs of whether each bug is the same bug as the bug in question relative to each input token, and calculate an overall probability as to whether the bugs are the same bug as the bug in question using the determined probabilities.

29. The system of claim 28, wherein the bug similarity system is configured to calculate the overall probability by applying Bayes' Theorem to the determined probabilities.